Genetic Epidemiology
Wiley
Preprints posted in the last 30 days, ranked by how well they match Genetic Epidemiology's content profile, based on 14 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.
Gunter, N. D.; Cardenas, A.; Kobor, M.; Gladish, N.; Rehkopf, D.; Dow, W.; Rosero-Bixby, L.; Hubbard, A. E.
Epigenetic clocks estimate biological age from DNA methylation patterns at CpG sites, providing robust predictions of mortality and morbidity risk. "Blue zones", regions of exceptional longevity, offer a unique opportunity to investigate how biological aging diverges from chronological age. However, standard clocks are typically trained on large, heterogeneous datasets, reflecting average population trends rather than region-specific dynamics. Using data from the Costa Rican Longevity and Healthy Aging Study (CRELES), we profiled DNA methylation from residents of the Nicoya blue zone (n = 206) and a comparison population in other parts of Costa Rica (n = 875). We propose training a SuperLearner, an ensemble machine learning approach, on the non-Nicoyan Costa Ricans to optimize predictive performance across existing clocks and flexible machine learners. Theoretically justified by its oracle property, SuperLearner performs asymptotically as well as the best candidate predictor in the ensemble, resulting in a weighted combination of algorithms used to predict age. We then used this trained model to construct a calibrated hypothesis test comparing residual age distributions between the blue zone region and the comparison population. Comparing our approach to the five top-performing epigenetic clocks (ranked by MSE) in the Costa Rican cohort, only SuperLearner suggested age deceleration (an average of ~1 year) in the non-Nicoyan reference group. Before calibration, SuperLearner showed the strongest evidence for slowed biological aging among blue zone Nicoyans, estimating a three-year reduction [Formula] in epigenetic age. Calibrating with non-Nicoyan Costa Ricans improved consistency between estimates in all clocks, decreasing the estimated aging advantage in Nicoyans to about two years [Formula]. This approach provides a robust framework for estimating longevity in distinct regions when a relevant comparison population is available.
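The abstract describes a weighted combination of candidate predictors and a comparison of age residuals calibrated against a reference group. The following is a minimal sketch of that arithmetic only, not the authors' implementation. It assumes the ensemble weights are already fitted, that a residual is predicted minus chronological age, and that calibration means subtracting the reference group's mean residual. All function names and numbers are hypothetical.

```python
from statistics import mean


def ensemble_predict(candidate_preds, weights):
    """Weighted combination of candidate age predictions for one person.

    Assumes non-negative weights that sum to 1 (a convex combination).
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * p for w, p in zip(weights, candidate_preds))


def residuals(predicted_ages, chronological_ages):
    """Residual age: predicted (epigenetic) age minus chronological age."""
    return [p - c for p, c in zip(predicted_ages, chronological_ages)]


def calibrated_difference(target_residuals, reference_residuals):
    """Mean residual in the target group after subtracting the reference mean."""
    return mean(target_residuals) - mean(reference_residuals)


# Toy residuals (years), invented for illustration only.
reference = [-1.0, -2.0, 0.0]   # mean -1: reference group looks 1 year "younger"
target = [-3.0, -4.0, -2.0]     # mean -3 before calibration
print(calibrated_difference(target, reference))  # calibrated gap in years
```

With these toy values the uncalibrated gap of three years shrinks to two once the reference group's own offset is removed, which mirrors the direction of the change reported in the abstract.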
Li, Y.; Cornejo-Sanchez, D. M.; Dong, R.; Naderi, E.; Wang, G. T.; Leal, S. M.; DeWan, A. T.
The genetic relationship between asthma and lung function may depend on the age-at-onset (AAO) of asthma. We investigated whether the shared genetics between asthma and lung function depends on AAO. Asthma cases from UK Biobank were subset according to their AAO, and genetic correlation was used to obtain genetically homogeneous groups, i.e., ≤20 (LT20), 20-40, and >40 (GT40) years. Association analysis and fine-mapping were performed to identify shared genetics between AAO groups and lung function. Mediation and quantitative trait locus (QTL) analyses were performed to identify mechanisms underlying shared genetic associations. Chr5, chr6, chr12, and chr17 each had one region that displayed a cross-phenotype replicated association with at least one AAO group and lung function. Overlapping credible sets obtained from fine-mapping were observed on chr5 and chr6. Mediation analyses demonstrated that for each region the proportion mediated through asthma on lung function was larger for asthma LT20 compared to 20-40 and GT40, suggesting that their effects on lung function were more strongly driven by this association. Tissue-specific QTL analysis revealed that shared etiology on chr5 may be acting through SLC22A5 and C5orf56, which might play an important role in decreased lung function among individuals with earlier-onset asthma.
Han, J.; Deng, K.; Hong, Z.; Zhang, Z.; Godneva, N.; de Mutsert, R.; van Hylckama Vlieg, A.; Rosendaal, F. R.; Mook-Kanamori, D. O.; Zheng, J.-S.; Chen, Y.; Segal, E.; Li-Gao, R.; DIYUFOOD consortium
Background and Objectives: Recent large-scale studies have consistently linked healthy dietary patterns to improved cardiometabolic health; however, the underlying biological pathways remain largely unclear, especially in non-European populations. In this study, we leverage data from four population-based cohorts (UK Biobank, NEO study, GNHS, and 10K) to investigate both common and cohort-specific biological pathways linking healthy dietary patterns to cardiometabolic disease through multi-omics profiling. Material and methods: In each cohort, we first assessed the associations between each of the five major dietary pattern scores (i.e., AMED, hPDI, DII, AHEI, and EDIH) and cardiometabolic disease risk using Cox or logistic regression models. To explore the potential mediating role, metabolomics and proteomics measurements were incorporated into the models. All models were adjusted for relevant confounders, and false discovery rate correction was applied to account for multiple testing. Results: With a total of 71,679 individuals without pre-existing cardiometabolic disease across four participating cohorts (UKB: 54,024, NEO: 4,838, GNHS: 3,201, and 10K: 9,616), we confirmed that adherence to healthy dietary patterns was associated with a 5-10% reduced risk of cardiometabolic disease. Three common biological pathways were identified: (1) mediation via large HDL particles and apolipoprotein F; (2) mediation via DNAJ/Hsp40 and triglyceride-rich lipoproteins; and (3) mediation via CRHBP-regulated HPA axis activity affecting triglyceride-rich lipoproteins. Conclusions: Our integrative multi-omics analysis across diverse populations identifies novel biomarkers that connect healthy dietary patterns with cardiometabolic risk. These findings deepen our understanding of the biological mechanisms underlying diet-related disease and hold promise for enhancing the development of precision nutrition interventions.
Radosavljevic, L.; Smith, S.; Nichols, T. E.
The UK Biobank (UKB) Brain Imaging cohort contains data from almost 100,000 subjects and has yielded invaluable understanding of the links between the brain and health outcomes and lifestyles. Much of the understanding of these links has come from exploring the association between Imaging Derived Phenotypes (IDPs) and other variables that are unrelated to brain imaging, so-called non-Imaging Derived Phenotypes (nIDPs). When performing analysis of this kind, it is very important to control for well-known confounding factors such as age, sex and socio-economic status, as well as confounds which are related to the imaging protocol itself. In previous work, we created a pipeline for constructing imaging confounds for use in statistical inference via a standard multivariate linear regression approach (Alfaro-Almagro et al. 2021). However, this approach is problematic when the number of confounds exceeds the number of subjects, and is severely underpowered when the number of subjects is not much larger than the number of confounds. In this work, we perform a simulation study to evaluate 13 modelling approaches to account for confounds when their number is similar to or exceeds the number of subjects. Based on the simulation results, we recommend a ridge regression based permutation test for low sample sizes (n ≤ 50), a version of de-sparsified LASSO for intermediate sample sizes (50 < n ≤ 500), and multivariate linear regression aided by Principal Component Analysis (PCA) for larger sample sizes (n > 500). We also demonstrate the use of our recommended methodology on a real data example of finding associations between Alzheimer's disease (AD) and IDPs.
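The sample-size recommendation in this abstract amounts to a three-way rule, which can be written down as a small lookup. This sketch only encodes the thresholds stated above. The function name and the returned labels are hypothetical and do not refer to any released software.

```python
def recommended_confound_method(n_subjects: int) -> str:
    """Map a sample size to the approach recommended in the abstract.

    Thresholds (n <= 50, 50 < n <= 500, n > 500) are taken from the text;
    the returned strings are descriptive labels only.
    """
    if n_subjects <= 0:
        raise ValueError("n_subjects must be a positive integer")
    if n_subjects <= 50:
        return "ridge regression permutation test"
    if n_subjects <= 500:
        return "de-sparsified LASSO"
    return "PCA-aided multivariate linear regression"


for n in (30, 200, 5000):
    print(n, "->", recommended_confound_method(n))
```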
Opperbeck, A.; Wang, Z.; Rautiainen, I.; Heikkinen, A.; Kaprio, J.; Ollikainen, M.; Sebert, S.; Sillanpaa, E.
Biological ageing begins before birth, with early-life exposures shaping late-life health. These exposures drive health inequities early, yet specific exposures and the composition of the ageing exposome remain largely undefined. This gap may persist as the field lacks agnostic investigations accounting for non-linearity, interactions and subtle signals. We aimed to identify exposures predictive of epigenetic ageing accumulated during childhood and adolescence and explore the composition of the "missing" exposome. In the FinnTwin12 cohort (847 participants measured at ages 12, 14, 17, and 22), over 500 exposures (including lifestyle, green environments, air pollutants, and demographic factors) were analysed using exposome-wide association studies and data-driven ML models (Knockoff Boosted Tree, sNPLS and Boruta). Epigenetic age (blood DNA methylation at age 22) was estimated using GrimAge and DunedinPACE. Our exposure set explains ~28% of the variance in epigenetic age (R² GrimAge = 25.7%; R² DunedinPACE = 30.8%). Predictors of increased epigenetic age included lifestyle and socioeconomic factors (smoking, alcohol use, youth unemployment), alongside green space, while tree cover, vegetation index, neighbourhood age structure and aerial black carbon emerged as predictors of decreased epigenetic age. Twin modelling revealed that unexplained variance - the missing exposome - consists primarily of environmental factors unshared by twin siblings, distinct from the substantial genetic component captured by our model. Our results underscore the need to expand the exposome approach and model non-linearities to reveal subtle environmental signals accumulating early in life. Because identified predictors include modifiable systemic factors, they offer opportunities to alter health trajectories and mitigate inequity early on.
Zhang, L.; Higgins, I. A.; Dai, Q.; Gkatzionis, A.; Quistrebert, J.; Bashir, N.; Dharmalingam, G.; Bhatnagar, P.; Gill, D.; Liu, Y.; Burgess, S.
Mendelian randomization has emerged as a transformative approach for inferring causal relationships between risk factors and disease outcomes. However, applying Mendelian randomization to disease progression - a critical step in validating pharmacological targets - is hampered by index event bias. This form of selection bias occurs because analyses of disease progression are necessarily restricted to individuals who have already experienced the disease event. Here, we present a comprehensive evaluation of statistical methods designed to mitigate index event bias, including inverse-probability weighting, Slope-Hunter, and multivariable methods. We compare the performance of these methods in simulations and applied examples. Inverse-probability weighting methods reduce bias, but require individual-level data and will only fully eliminate bias when the disease event model is correctly specified. Slope-Hunter performed poorly in all simulation scenarios, even when its assumptions were fully satisfied. Multivariable methods worked best when including genetic variants that affect the incident disease event. However, if these genetic variants also affect disease progression directly, then the analysis will suffer from pleiotropy. Hence, if the same biological mechanisms affect disease incidence and progression, then multivariable methods will have little utility. But in such a case, analyses of disease progression are less critical, as conclusions reached from analyses of disease incidence are likely to hold for disease progression. Our findings indicate that no single method is a universal solution to provide reliable results for the investigation of disease progression. Instead, we propose a strategic framework for method selection based on data availability and biological context.
Lalaurie, C.; Liu, L.; Khan, A.; Wang, C.; Rich, S.; Barr, R. G.; Bernstein, E.; Kiryluk, K.; McDonnell, T. C. R.; Luo, Y.
Anti-β2-glycoprotein I (anti-β2GPI) antibodies are central to the pathogenesis of antiphospholipid syndrome (APS), an autoimmune disease characterized by a strong predisposition to venous thromboembolism (VTE). In this study, we conducted a multi-ancestry genome-wide association study (GWAS) of quantitative total anti-β2GPI levels in 5,969 participants enrolled in the Multi-Ethnic Study of Atherosclerosis (MESA) and identified a genome-wide significant association at the APOH locus. Paradoxically, genetically determined increases in anti-β2GPI levels at this locus were associated with lower VTE risk. Fine-mapping and functional genomics prioritized the missense variant rs1801690 (W335S) in β2GPI (apolipoprotein H, APOH) as the most likely causal variant. This variant has an allele frequency of 5-6% in European and East Asian ancestries but only 1% in African ancestries. Integrating prior experimental studies, molecular dynamics simulations and structure-based epitope prediction, we propose a dual-effect mechanism whereby W335S reduces thrombotic risk by disrupting phospholipid binding in Domain V, yet increases autoantibody production through conformational changes that enhance epitope exposure in Domains I and II. These findings mechanistically uncouple autoantibody formation from thrombotic risk in carriers of the W335S variant, and suggest that APOH genotype may represent a clinically relevant genetic biomarker with potential utility for thrombotic risk stratification in anti-β2GPI-positive individuals.
Berg, N. v. d.; Natalle Lopes, G.; Bogaards, F.; Beekman, M.; Amaro Junior, E.; Deelen, J.; Slagboom, P. E.
The biomarker MetaboHealth represents a novel indicator of overall health in middle age and may be suitable as an actionable health check in prevention strategies. MetaboHealth is a blood-based metabolomic composite score that predicts a wide range of age-related conditions and mortality in large European cohorts. Here, we investigated whether MetaboHealth can be personalised and limited to clinically validated metabolomic markers. Next, we assessed whether the updated MetaboHealth score predicts all-cause mortality and cardiometabolic disease incidence and can be improved by a lifestyle intervention. To personalise MetaboHealth, we scaled the metabolomic markers using a Dutch reference population (i.e. the Biobanking and BioMolecular Research Infrastructure Netherlands) and, in addition, based the score solely on clinically validated metabolic markers. The novel version of the score, Personal-MetaboHealth, retained predictive accuracy for all-cause mortality and showed an even stronger association with incident cardiometabolic disease in the Leiden Longevity Study (LLS), in which 2,404 participants were followed for up to 22 and 16 years for mortality and morbidity, respectively. The association of Personal-MetaboHealth with all-cause mortality remained robust after adjusting for smoking, alcohol use, and medication, while the cardiometabolic disease association was partially driven by smoking. Each standard deviation decrease in Personal-MetaboHealth was associated with an 11.7-year earlier onset of the first cardiometabolic disease in the LLS. Next, we showed that Personal-MetaboHealth can be improved by a 3-month combined lifestyle intervention in middle-aged individuals (Growing Old Together study), specifically in those at risk with an unhealthy score at baseline. Personal-MetaboHealth thus offers a potential actionable health check in middle age for early prevention and extension of healthy lifespan.
Mundo, S.; Grabowska, M.; Dickson, A.; Xin, Y.; Serley, S.; Li, B.; Stein, C. M.; Wei, W.-Q.; Feng, Q.
Drug repurposing offers the opportunity to identify promising drug targets efficiently using existing data, but these efforts currently have limitations; there is a particular need for versatile but rigorous high-throughput approaches. As such, we developed a flexible, high-throughput, Mendelian randomization (MR)-based drug repurposing pipeline with three stages: 1) MR-based identification, 2) MR-based validation and prioritization, and 3) application. This pipeline can be applied to a broad range of clinical characteristics and diagnoses, including binary and continuous traits. Along with this flexibility, it offers rigorous quality control and validation. In Stage 1, the pipeline conducts MR analyses to identify proteins as potential drug targets (exposures) for a specified trait/condition (outcome). The MR analysis includes quality control steps, such as testing for heterogeneity, horizontal pleiotropy, and Bayesian colocalization. In Stage 2, MR analysis with quality control is conducted with significant results from Stage 1 (exposures) for either the same (external cohort only) or a related outcome. Drug targets with a consistent direction of association in Stages 1 and 2 are then assessed in Stage 3, which queries DGIdb, a database of druggable therapeutic targets. To demonstrate the utility and flexibility of this pipeline, we applied it to atherosclerotic cardiovascular disease. Using UKB-PPP cis-pQTLs as instruments for 2,923 circulating proteins, we assessed causal effects on LDL-C and triglyceride levels from the GLGC (Stage 1) and validated lipid-associated targets with a large coronary artery disease GWAS (Stage 2). Stage 3 mapped 6 proteins that interact with approved drugs, highlighting drug repurposing opportunities.
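The hand-off from Stage 2 to Stage 3 keeps only targets whose direction of association agrees across stages. A minimal sketch of such a filter follows, assuming each stage is summarised as a mapping from protein name to an MR effect estimate and that "consistent direction" means the two estimates share a sign. The function name, the data layout, and the protein labels are hypothetical and are not taken from the pipeline itself.

```python
def direction_consistent_targets(stage1_effects, stage2_effects):
    """Return proteins whose effect estimates have the same sign in both stages.

    Proteins missing from Stage 2, or with an estimate of exactly zero in
    either stage, are dropped because no direction can be compared.
    """
    kept = []
    for protein, beta1 in stage1_effects.items():
        beta2 = stage2_effects.get(protein)
        if beta2 is None or beta1 == 0 or beta2 == 0:
            continue
        if (beta1 > 0) == (beta2 > 0):
            kept.append(protein)
    return sorted(kept)


# Invented effect estimates for placeholder proteins.
stage1 = {"PROT_A": 0.12, "PROT_B": -0.30, "PROT_C": 0.05, "PROT_D": 0.20}
stage2 = {"PROT_A": 0.08, "PROT_B": 0.10, "PROT_C": 0.02}
print(direction_consistent_targets(stage1, stage2))  # candidates for Stage 3
```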
Hamilton, F. W.
In a recent article in Science, Shenhar et al. report that human life span heritability reaches ~55% after removing "extrinsic" mortality, roughly seven-fold higher than recent large pedigree estimates. This conclusion rests on classifying deaths from infections and accidents as environmental noise independent of genetics. This premise is biologically untenable: susceptibility to severe infection is substantially heritable, with adoptee studies showing relative risks exceeding 5 for infection death when a biological parent died of infection. By encoding the assumption that extrinsic mortality is non-genetic directly into their Gompertz-Makeham model, removing it necessarily inflates heritability estimates. This creates selection bias rather than correcting for confounding and explains the contradiction with both pedigree studies and GWAS findings. The proposed heritability estimate is therefore not the true heritability of any population, past or present.
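For readers unfamiliar with the model named above, the textbook Gompertz-Makeham hazard adds an age-independent (Makeham) term to an exponentially increasing (Gompertz) term. The sketch below shows that general form. Treating the Makeham term as the "extrinsic" component and setting it to zero is an assumption made here for illustration, not a description of the exact procedure in either paper, and all parameter values are hypothetical.

```python
import math


def gompertz_makeham_hazard(age, makeham, gompertz_scale, gompertz_rate):
    """Mortality hazard: makeham + gompertz_scale * exp(gompertz_rate * age)."""
    return makeham + gompertz_scale * math.exp(gompertz_rate * age)


# Hypothetical parameters, chosen only to make the two terms visible.
MAKEHAM = 0.002   # age-independent term
SCALE = 0.0001    # Gompertz baseline
RATE = 0.09       # Gompertz rate of increase per year of age

with_extrinsic = gompertz_makeham_hazard(40, MAKEHAM, SCALE, RATE)
intrinsic_only = gompertz_makeham_hazard(40, 0.0, SCALE, RATE)
print(with_extrinsic, intrinsic_only)
```

In this form the two hazards differ by exactly the Makeham term at every age, so any genetic contribution to deaths counted under that term disappears from the model once it is set to zero.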
Howard, D. M.; Rabelo-da-Ponte, F. D.; Viejo-Romero, M.; Vassos, E.; Lewis, C. M.
Depression is a heterogeneous disorder, often diagnosed based on symptom co-occurrence. However, individuals may present with markedly different symptom profiles, potentially reflecting distinct underlying mechanisms. Identifying common patterns of symptoms using data-driven approaches could help clarify the heterogeneity of depression. Furthermore, examining the sociodemographic and lifestyle characteristics, health status, and polygenic scores of individuals with specific symptom profiles may offer insights into underlying risk factors. Unsupervised machine learning models were applied to large-scale data from the UK Biobank. Independent groups of individuals were assessed at two time points (the Mental Health Questionnaire: Q1; and the Mental Well-being Questionnaire: Q2) and reported on historical or current episodes of depression. Two machine learning models, multivariate Bernoulli mixtures and agglomerative hierarchical clustering, were used to identify common sets of symptoms and cluster individuals by symptom similarity. Consistency of results was examined between Q1 and Q2 and between clustering models. Associations between cluster membership probabilities and sociodemographic and lifestyle factors (sex, age, body mass index, smoking status, ethnicity, and deprivation), eight health conditions, and polygenic scores for bipolar disorder, schizophrenia, and attention-deficit/hyperactivity disorder (ADHD) were examined using regression models. Symptom clusters were highly consistent across Q1 and Q2 (mean correlation > 0.81) and between machine learning models (Rand Index > 0.83). Clusters aligned with the existing clinical subtypes, atypical and melancholic depression, alongside other potentially novel clusters reflecting a range of different symptom profiles. Atypical clusters (hypersomnia with weight gain) appeared in both Q1 and Q2 and were associated with younger age and higher body mass index. Distinct clusters combining insomnia, weight gain, and having thoughts of death were associated with asthma, suggesting potential inflammatory dysregulation. Further clusters were characterised by psychomotor changes and showed strong associations with Parkinson's disease, both before and after the mental health questionnaire was conducted. These findings highlight robust and clinically meaningful symptom subtypes within depression and support the use of data-driven approaches to improve diagnostic refinement and inform personalised treatment strategies.
Dobbins, S. E.; Forner-Cordero, I.; Amigo Moreno, R.; Southgate, L.; Hobbs, K.; Moy, R.; Adjei, M.; Muntane, G.; Vilella, E.; Martorell, L.; Gordon, K.; Ostergaard, P. E.; Pittman, A.
Lipoedema is a chronic adipose tissue disorder mainly affecting women with excess subcutaneous fat deposition on the lower limbs, associated with pain and tenderness. There is often a family history of lipoedema, suggesting a genetic origin, but the contribution of genetics is not well studied. We conducted a genome-wide association study (GWAS) for this disorder in a clinically ascertained cohort from Spain and performed a meta-analysis with the UK lipoedema cohort GWAS. We then used the results of this study as a replication of the inferred UK Biobank "lipoedema phenotype" study. Whilst our meta-analysis alone did not identify any genome-wide significant associations, our clinical cohorts provide support for three loci identified through the UKBB study: the chr2q24.3 GRB14-COBLL1 locus (rs6753142, P_META = 1.64 × 10⁻⁶), chr6p21.1 VEGFA locus (rs4711750, P_META = 8.99 × 10⁻⁷) and the chr5q11.2 ANKRD55-MAP3K1 locus (rs3936510, P_META = 1.67 × 10⁻⁵). We identify numerous rare SNPs with strong association signals in our meta-analysis (P < 1 × 10⁻⁶) with support in both UK and Spanish datasets, three of which also show nominal support in the UKBB (P < 0.05). These findings provide a starting point towards understanding the genetic basis of clinical lipoedema and demonstrate the utility of the interplay of large-scale biobank genetic data and clinically ascertained cohorts to elucidate the genetic architecture of lipoedema.
Liu, C.; Mayer, M.; Lactaoen, K.; Gomez, L.; Weissman, G.; Hubbard, R.
Hybrid controlled trials (HCTs) incorporate real-world data into randomized controlled trials (RCTs) by augmenting the internal control arm with patients receiving the same treatment in routine care. Beyond increasing power, HCTs may improve recruitment by supporting unequal randomization ratios that increase patient access to experimental treatments. However, HCT validity is threatened by bias from unmeasured confounding due to lack of randomization of external controls, leading to outcome non-exchangeability between internal and external control patients. To address this challenge, we developed a sensitivity analysis framework to assess the robustness of HCT results to potential unmeasured confounding. We propose a tipping point analysis that adapts the E-value framework to the HCT setting where trial participation rather than treatment assignment is subject to confounding. To aid interpretation, we also introduce a data-driven benchmark representing the strength of unmeasured confounding reflected by the observed outcome non-exchangeability. We then propose an operational decision rule and evaluate its performance through simulation studies. Finally, we illustrate the approach using an asthma trial augmented by data from electronic health records. Simulation results demonstrate that our decision rule safeguards against Type I error inflation while preserving the power gains achieved by incorporating external data. In settings where moderate unmeasured confounding led to poorer outcomes for external controls, Type I error was controlled near the nominal 5% level, and power increased by 10-20% compared with analyses using RCT data alone. Our approach provides a practical, interpretable method to assess HCT robustness, supporting rigorous inference when integrating external real-world data.
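As background for the tipping point analysis, the standard E-value for a risk ratio (VanderWeele and Ding) is RR + sqrt(RR × (RR - 1)), with the risk ratio inverted first when it is below 1. The sketch below implements only that standard formula. The authors' adaptation to trial participation in the HCT setting is not specified in the abstract and is not reproduced here, and the function name is hypothetical.

```python
import math


def e_value(risk_ratio: float) -> float:
    """Standard E-value for an observed risk ratio.

    Risk ratios below 1 are inverted first so the formula is applied to a
    value of at least 1. A risk ratio of 1 gives an E-value of 1.
    """
    if risk_ratio <= 0:
        raise ValueError("risk_ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1 else 1.0 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))


print(e_value(2.0))   # strength of confounding needed to explain away RR = 2
print(e_value(0.5))   # same value, because 0.5 is inverted to 2
```

The E-value is read as the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need with both the exposure and the outcome to fully explain away the observed estimate.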
Di, Y.; Cai, N.
Electronic health records (EHRs) have become the cornerstone of population-scale genetic studies [1], but factors including patterns of healthcare use shape which and how diagnoses are recorded, leading to confounding effects in genetic associations with EHR codes [2]. In this study we propose EDGAR, a deep learning framework that recovers lifetime disease liability from EHR by aligning diagnostic codes with clinically validated measures and disease labels in a set of individuals prioritized through active learning. EDGAR yields representations that better capture disease-specific effects in genome-wide association analyses (GWAS). It also enables us to isolate a genetic factor that captures systemic biases in EHR codes, which distorts cross-disease correlations and drives spurious links with behavioral and socio-economic traits. We find that this factor generalizes across EHRs, and its identification in one EHR enables its removal from existing GWAS in another. Overall, our work presents a promising direction for improving specificity of EHR-based GWAS.
Senders, A. J.; Azarbarzin, A.; Kaffashi, F.; Loparo, K. A.; Redline, S.; Butler, M. P.
Background: Obstructive sleep apnea (OSA), as measured by the Apnea Hypopnea Index (AHI), is associated with adverse outcomes. Measures that characterize the temporal variability in events may provide information over and beyond a simple summary of event frequency as measured by the AHI. Research Question: To assess whether temporal variability in the occurrence of obstructive apnea/hypopneas during the night is associated with all-cause mortality or incident cardiovascular disease (CVD). Study Design and Methods: Data from the Sleep Heart Health Study (SHHS), a prospective multi-site community-based cohort, were analyzed. For each person, the intervals between apnea/hypopnea events (inter-event interval; IEI) were used to calculate a coefficient of variation for their IEIs (IEI_CV). Risk for mortality (n=5,701) and incident CVD (n=4,373) were estimated by adjusted Cox proportional hazards models. Sensitivity analyses were conducted to test potential explanatory variables such as hypoxic burden and duration of uninterrupted sleep. Results: Over a median follow-up of 11.8 years (IQR 10.6-12.2), 1,287 deaths occurred. After adjusting for potential confounders, including OSA severity, participants in the lowest quartile of IEI_CV (Q1) had a 40% higher risk of all-cause mortality compared with those in the highest quartile (Q4) (hazard ratio [HR] = 1.40; 95% confidence interval [CI], 1.20-1.64). In 11.5 years of follow-up (IQR 7.9-12.7), 867 CVD events occurred. The adjusted hazard rate for CVD was 29% higher (HR=1.29 [1.06-1.56]) for those with less variable IEI. Minimal reductions in effect sizes were observed after additional adjustment for hypoxic burden and additional novel and traditional covariates. In sensitivity analyses, adjusting for the longest bout of uninterrupted sleep without respiratory events attenuated the association for CVD incidence (HR=1.15 [0.89-1.50]). Interpretation: The temporal distribution of respiratory events - specifically, less variability in inter-event intervals (more regular event occurrences) - is associated with higher mortality and incident CVD.
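The IEI_CV measure described in the methods is a coefficient of variation of the gaps between consecutive respiratory events. A minimal sketch of that calculation follows, assuming event onset times in seconds and the sample standard deviation. The abstract does not state which standard deviation or which event timestamps were used, so both are assumptions, and the function names are hypothetical.

```python
from statistics import mean, stdev


def inter_event_intervals(event_times):
    """Gaps between consecutive events, after sorting the onset times."""
    times = sorted(event_times)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def iei_cv(event_times):
    """Coefficient of variation (SD / mean) of the inter-event intervals."""
    intervals = inter_event_intervals(event_times)
    if len(intervals) < 2:
        raise ValueError("need at least three events to compute a CV")
    average = mean(intervals)
    if average == 0:
        raise ValueError("mean inter-event interval is zero")
    return stdev(intervals) / average


# Invented onset times (seconds since sleep onset).
regular = [0, 60, 120, 180]     # evenly spaced events
irregular = [0, 30, 120, 180]   # same span, uneven spacing
print(iei_cv(regular), iei_cv(irregular))
```

Evenly spaced events give a CV of zero, so a low IEI_CV corresponds to the "more regular event occurrences" that the abstract links to higher risk.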
Robertson, J. A.; Krätschmer, I.; Richmond, A.; McCartney, D. L.; Bajzik, J.; Vernardis, S.; Corley, J.; Tomlinson, S. J.; Vieno, M.; Chybowska, A. D.; Grauslys, A.; Smith, H. M.; Brigden, C.; Messner, C. B.; Zelezniak, A.; Ralser, M.; Russ, T. C.; Pearce, J.; Cox, S. R.; Robinson, M. R.; Marioni, R. E.
Ambient air pollution has been associated with increased incidence of chronic disease and is estimated to contribute towards 4.2 million early deaths annually. Whilst the health impacts are well described, less is understood about the underlying biological mechanisms, particularly when considering the co-occurrence of multiple pollutants. Using an atmospheric chemistry transport model (EMEP4UK), we generate pre-baseline sampling pollution exposure estimates for eight pollutants in Generation Scotland (N = 22,071, recruited between 2006 and 2011). Cox proportional hazards models reveal associations between pollution exposure and all-cause dementia (PM2.5) and myocardial infarction (NO3_Coarse) over 18 years of follow-up. We perform Bayesian multivariate epigenome-wide (N = 18,512, Illumina EPIC v.1) and proteomic (N = 15,314, 133 mass-spectrometry proteins) association studies, revealing 11 pollutant-methylation associations and 140 pollutant-protein associations. We identify positive associations between exposure (PM2.5 and NO3_Fine) and epigenetic age-acceleration (PhenoAge epigenetic clock). Furthermore, we explore the development of pollutant EpiScores, assessing these in holdout and independent test sets. Our results enhance knowledge of molecular correlates of air pollution exposure, whilst providing further evidence of contributions of air pollutants to chronic disease.
Yang, C.; BioBank Japan Project; Namba, S.; Matsuda, K.; Okada, Y.; Moran, L.; Vincent, A.; Marques, F. Z.
Background: Sex hormone alterations, such as estrogen deficiency or testosterone excess, substantially increase cardiovascular disease (CVD) risk in females. Dietary fibre and its microbial by-products, short-chain fatty acids (SCFAs), have cardioprotective effects, but it remains unclear whether these benefits extend to females with an altered sex hormone profile. In this study, we aim to investigate whether dietary fibre intake, measured via plasma acetate (the most abundant SCFA), is associated with improved cardiovascular outcomes in females with altered sex hormone profiles. Methods: This cohort study included 116,235 female participants from the UK Biobank and Biobank Japan with up to 10 years of follow-up. We analysed early menopause (as a surrogate for estrogen insufficiency) and plasma free testosterone (in a subset). The primary outcome was major adverse cardiovascular events (MACE). The secondary outcome was blood pressure. Proteomics analyses explored potential mechanisms. Results: Acetate levels were associated with lower 10-year MACE incidence (-0.618 per 1,000 woman-years, HR=0.900, p=0.002) and systolic blood pressure (-0.231 mmHg per 1 SD, p<0.001) in the UK Biobank. High acetate levels attenuated the increased MACE risk associated with early menopause (HR=1.158, p=0.057) compared with low acetate (HR=1.425, p<0.001), with similar patterns replicated in Biobank Japan (high: HR=1.322, p=0.090; low: HR=1.385, p=0.042). Proteomics analyses suggested a mechanism involving pro-inflammatory proteins. Moreover, high acetate levels attenuated the increased MACE associated with elevated free testosterone in the UK Biobank (high: HR=1.238, p=0.024; low: HR=1.056, p=0.666). A significant interaction between acetate and free testosterone on systolic blood pressure indicated that the blunting of acetate's effect by rising testosterone (β=0.167, 95% CI: [5.212 × 10⁻², 2.818 × 10⁻¹], p=0.004) was partially mediated by central obesity (waist-to-hip ratio). Conclusions: Higher plasma acetate levels were associated with lower cardiovascular risk, particularly in females with early menopause or elevated free testosterone, potentially via inflammatory pathways. These findings underscore the importance of hormonal context in shaping cardiometabolic resilience and support personalised CVD prevention strategies for females with altered sex hormone profiles, including increasing dietary fibre intake.
Fransquet, P. D.; Yu, C.; Tran, C.; Hussain, S. M.; Bousman, C.; Nelson, M. R.; Tonkin, A. M.; McNeil, J. J.; Lacaze, P.
Aims: Low-dose aspirin is no longer routinely recommended for primary prevention in older adults because bleeding risks outweigh cardiovascular benefits. We aimed to investigate whether polygenic scores (PGSs) could modify the effects of aspirin on major bleeding and major adverse cardiovascular events (MACE) in a trial of older individuals. Methods: We conducted a post-hoc genetic analysis of the Aspirin in Reducing Events in the Elderly (ASPREE) randomized, placebo-controlled trial in Australia and the United States. Participants aged ≥70 years (≥65 years for U.S. minorities) without cardiovascular disease, dementia, or physical disability were randomized to 100 mg daily aspirin or placebo. Among those with high-quality genotyping data (n=13,571; median follow-up 4.6 years), we tested 572 cardiovascular- and hematologic-related PGSs for interaction with aspirin using Cox proportional hazards models, applying Bonferroni correction. Results: A triglyceride-related PGS (PGS003144) modified aspirin's effect on major bleeding (interaction P=5.9×10⁻⁵; Bonferroni-adjusted P=0.034). In the lowest PGS quintile, aspirin increased major bleeding compared with placebo (hazard ratio [HR] 2.28; 95% CI 1.45-3.58) without reducing MACE (HR 1.04; 95% CI 0.67-1.62). In contrast, in the highest quintile, aspirin was associated with lower risks of major bleeding (HR 0.62; 95% CI 0.38-0.97) and MACE (HR 0.66; 95% CI 0.44-0.99). Baseline measured triglyceride levels demonstrated a similar pattern of effect modification. Conclusion: A triglyceride-related PGS identifies older adults with divergent bleeding and cardiovascular responses to aspirin, supporting the potential role of genetically informed strategies for primary cardiovascular prevention. Lay summary: This study shows that genetic differences related to triglyceride levels may help identify older adults who are more likely to be harmed by, or to benefit from, taking aspirin to prevent heart disease.
- In older adults with certain genetic profiles linked to triglycerides, aspirin increased the risk of serious bleeding without reducing heart attacks or strokes, while in others it was associated with lower risks of both bleeding and cardiovascular events.
- Using genetic information alongside traditional risk factors could help tailor aspirin use for primary prevention, avoiding unnecessary harm while identifying those most likely to benefit.
Graphical Abstract: Genetic stratification of aspirin benefit and harm using a triglyceride polygenic score. Screening of 572 polygenic scores in the ASPREE trial identified a triglyceride-related PGS that modified aspirin-associated bleeding and cardiovascular risk. Aspirin increased bleeding risk in the lowest PGS quintile but reduced major bleeding and MACE in the highest quintile. Abbreviations: PGS, polygenic score; GI, gastrointestinal; IC, intracranial; MACE, major adverse cardiovascular events.
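The Bonferroni step reported in this abstract can be checked with simple arithmetic. The sketch below is only an illustration, not the authors' code: the helper name `bonferroni_adjust` is hypothetical, and the inputs are the two figures quoted in the abstract (572 PGSs tested, raw interaction P of 5.9×10⁻⁵).

```python
# Minimal sketch of a Bonferroni adjustment (hypothetical helper, not from the study).
# The adjusted p-value is the raw p-value times the number of tests, capped at 1.

def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Return the Bonferroni-adjusted p-value."""
    return min(1.0, p_value * n_tests)

n_scores = 572            # number of PGSs tested, as stated in the abstract
p_interaction = 5.9e-5    # raw interaction P for PGS003144, as stated in the abstract

adjusted = bonferroni_adjust(p_interaction, n_scores)
per_test_threshold = 0.05 / n_scores  # equivalent per-test cutoff at alpha = 0.05

print(f"adjusted P = {adjusted:.3f}")
print(f"per-test threshold = {per_test_threshold:.2e}")
```

Under these inputs the adjusted value rounds to the 0.034 reported in the abstract, and the raw P falls below the per-test threshold of 0.05/572.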
Bustillo, A. J.; Zeki Al Hazzouri, A.; Glymour, M. M.; Kezios, K.
PURPOSE: Over 6.9 million Americans above the age of 65 are living with Alzheimer's disease (AD) or related dementias (ADRD), which are diseases characterized by cognitive decline and structural brain changes associated with accelerated brain aging. Cardiovascular risk factors, in particular hypertension, are well-studied risk factors for AD/ADRD. Evidence suggests that the effects of hypertension on cognitive aging may vary by life stage, yet prior studies have focused on the effects of mid- or late-life hypertension or blood pressure, leaving other life stages, including early life, unstudied. Owing to the logistical complexity of follow-up throughout the life course, cognitive aging cohorts rarely have both early-life blood pressure exposure data and mid/late-life cognitive and brain aging outcome data. When such data are unavailable from any single data source, data fusion methods may be employed to pool two compatible data sources to impute an early-life blood pressure exposure history and produce a synthetic longitudinal cohort in which the associations between early-life blood pressure and mid/late-life cognition and brain aging can be estimated. The purpose of this work is to estimate the association between early-life blood pressure and mid- and late-life cognition and brain aging in a synthetic longitudinal cohort. METHODS: We pooled the Bogalusa Heart Study (BHS), which provides early-life blood pressure data (ages 4-16), and the CARDIA study, which provides mid/late-life cognition and brain aging outcome data (ages 58-70), to generate a synthetic longitudinal cohort. Cognition was defined as cognitive domain scores (including executive function, memory, processing speed, and language) calculated by Z-transforming cognitive test scores within each cohort. Global cognition was calculated as the average of these Z-scores.
Brain aging was defined using the Spatial Patterns of Atrophy for Recognition of Brain Aging, a measure of age-related brain atrophy derived from T1-weighted MRI scans. The cohorts overlapped at ages 17-57 for potential matching variables including blood pressure, sociodemographics, and vascular risk factors. Cognition overlapped between ages 41-58. We pooled data by distance-matching many-to-one (BHS to CARDIA) on mediators and confounders of each exposure-disease relationship that overlapped in age of measurement between the two cohorts. These variables included intermediate values of the exposure (blood pressure, ages 17-57) and cognition (ages 41-58), in addition to sociodemographic and vascular risk factors. Linear regression models estimated the association between early-life blood pressure and cognitive and brain aging outcomes. RESULTS: BHS uniquely provided early-life blood pressure data (ages 4-16), while CARDIA provided cognitive and brain aging data at ages 58-70. Matching was feasible at ages 17-57 on blood pressure, sociodemographics, and vascular risk factors, but only at ages 41-57 for cognition. CONCLUSIONS: Our results demonstrate the feasibility and suitability of two US-based cardiovascular cohorts for generating a synthetic life-course cohort to estimate early-life blood pressure and its association with mid/late-life cognitive and brain aging outcomes. Future studies should aim to use measures that more closely overlap between both cohorts. Additionally, future studies should examine longer spans of the life course, such as early life through late life.
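The two data-preparation steps described in this abstract (within-cohort Z-scoring with a global average, and many-to-one distance matching) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: the function names and all numbers are hypothetical, the population standard deviation and Euclidean distance are assumed because the abstract does not specify either, and "many-to-one" is read here as allowing several BHS-like rows to map to the same CARDIA-like row.

```python
import math
from statistics import mean, pstdev

def z_scores(values):
    """Z-transform raw test scores within one cohort (population SD assumed)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def global_cognition(domain_z):
    """Average the domain Z-scores per participant.

    domain_z maps a domain name to a list of Z-scores, one per participant.
    """
    n = len(next(iter(domain_z.values())))
    return [mean(domain_z[d][i] for d in domain_z) for i in range(n)]

def match_many_to_one(donors, recipients):
    """For each donor row, return the index of the nearest recipient row.

    Euclidean distance is an assumption; several donors may share a recipient.
    In practice the matching variables would be put on a common scale first.
    """
    return [
        min(range(len(recipients)), key=lambda j: math.dist(d, recipients[j]))
        for d in donors
    ]

# Hypothetical raw scores for three participants in one cohort.
domains = {
    "memory": z_scores([10, 12, 14]),
    "processing_speed": z_scores([30, 20, 10]),
}
global_z = global_cognition(domains)

# Hypothetical matching variables: (systolic blood pressure, smoker flag).
bhs_like = [(110.0, 0.0), (111.0, 0.0), (130.0, 1.0)]
cardia_like = [(112.0, 0.0), (128.0, 1.0), (150.0, 1.0)]
matches = match_many_to_one(bhs_like, cardia_like)
print(global_z, matches)
```

In this toy data the first two BHS-like rows share one CARDIA-like match, which is what makes the matching many-to-one.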
Xie, R.; Bhardwaj, M.; Sha, S.; Peng, L.; Vlaski, T.; Brenner, H.; Schoettker, B.
Background: While multi-omics approaches incorporating polygenic risk scores (PRS), metabolomics, and proteomics have shown promise in predicting major adverse cardiovascular events (MACE), their added value beyond cardiovascular disease (CVD) risk factors remains underexplored. We aimed to assess whether integrating multi-omics biomarkers into the SCORE2 model improves the prediction of MACE in apparently healthy individuals. Methods: This study included 24,042 UK Biobank participants without CVD or diabetes mellitus, aged 40-69 years. Multi-omics biomarkers were fitted in sex-specific models including the variables of SCORE2 and 9 metabolites, 12 proteins, and a PRS for CVD in males, as well as 7 metabolites, 11 proteins, and a PRS for CVD in females. The performance of the SCORE2 model and its multi-omics extensions was compared using Harrell's C-index and the net reclassification index (NRI) in a training and a test set (70% and 30% of the study population, respectively). Results: During 10 years of follow-up, 1,204 MACE events occurred. Integrating multi-omics biomarkers into SCORE2 significantly improved the predictive performance (C-index: 0.708 to 0.769, P<0.001; NRI=26.2%). In males, the C-index improved from 0.682 to 0.752 (ΔC-index=+0.070, P<0.001; NRI=12.4%), while in females, it increased from 0.724 to 0.782 (ΔC-index=+0.058, P<0.001; NRI=30.4%). However, full multi-omics measurements may not be needed because the combination of proteomics and PRS yielded comparable performance in males (C-index=0.749) and females (C-index=0.782). Conclusions: Integrating a protein panel and a PRS significantly improves MACE risk prediction by the SCORE2 model, which includes HDL and total cholesterol. Adding further metabolites has limited additional predictive value.
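For readers unfamiliar with the metric, the following is a minimal sketch of Harrell's C-index on toy data, followed by a check of the ΔC-index values quoted in the abstract. The function `harrell_c` and the toy follow-up data are hypothetical, and the sketch simplifies by skipping pairs with tied event times; it is not the software the authors used.

```python
def harrell_c(times, events, risk):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when participant i had an event and a strictly
    shorter follow-up time than j. Tied risk scores count as half concordant.
    Pairs with tied times are ignored in this simplified sketch.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored participant cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical toy data: follow-up years, event indicator, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.6, 0.7, 0.1]
c_toy = harrell_c(times, events, risk)

# Differences between the C-index values quoted in the abstract.
delta_males = round(0.752 - 0.682, 3)
delta_females = round(0.782 - 0.724, 3)
print(c_toy, delta_males, delta_females)
```

The toy data has five comparable pairs, four of them ranked correctly, and the two differences reproduce the +0.070 and +0.058 reported for males and females.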